9 research outputs found

    Lost in Time: Temporal Analytics for Long-Term Video Surveillance

    Video surveillance is a well-researched area of study, with substantial work done on object detection, tracking and behavior analysis. With the abundance of video data captured over a long period of time, we can understand patterns in human behavior and scene dynamics through data-driven temporal analytics. In this work, we propose two schemes to perform descriptive and predictive analytics on long-term video surveillance data. We generate heatmap and footmap visualizations to describe spatially pooled trajectory patterns with respect to time and location. We also present two approaches for anomaly prediction at the day-level granularity: a trajectory-based statistical approach, and a time-series based approach. Experiments on one year of data from a single camera demonstrate the ability to uncover interesting insights about the scene and to predict anomalies reasonably well. Comment: To appear in Springer LNEE.
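
    A minimal sketch of the heatmap idea described above: spatially pooled trajectory points accumulated into a coarse occupancy grid over the camera frame. The function name, bin size and trajectory format are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def trajectory_heatmap(trajectories, frame_h, frame_w, bin_size=8):
    """Accumulate (x, y) trajectory points into a coarse occupancy grid.

    Assumed (hypothetical) input format: each trajectory is an iterable
    of (x, y) pixel coordinates produced by the tracker.
    """
    grid_h, grid_w = frame_h // bin_size, frame_w // bin_size
    heat = np.zeros((grid_h, grid_w), dtype=np.float64)
    for traj in trajectories:
        for x, y in traj:
            r, c = int(y) // bin_size, int(x) // bin_size
            if 0 <= r < grid_h and 0 <= c < grid_w:
                heat[r, c] += 1.0              # one vote per tracked point
    return heat / max(heat.max(), 1e-9)        # normalise for visualisation

# Example: one short trajectory on a 480x640 frame
demo = trajectory_heatmap([[(100, 200), (108, 204), (116, 210)]], 480, 640)
```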

    Enriched Long-term Recurrent Convolutional Network for Facial Micro-Expression Recognition

    Facial micro-expression (ME) recognition has posed a huge challenge to researchers due to its subtlety in motion and the limited databases available. Recently, handcrafted techniques have achieved superior performance in micro-expression recognition, but at the cost of domain specificity and cumbersome parametric tunings. In this paper, we propose an Enriched Long-term Recurrent Convolutional Network (ELRCN) that first encodes each micro-expression frame into a feature vector through CNN module(s), then predicts the micro-expression by passing the feature vector through a Long Short-term Memory (LSTM) module. The framework contains two different network variants: (1) channel-wise stacking of input data for spatial enrichment, and (2) feature-wise stacking of features for temporal enrichment. We demonstrate that the proposed approach achieves reasonably good performance without data augmentation. In addition, we present ablation studies conducted on the framework and visualizations of what the CNN "sees" when predicting the micro-expression classes. Comment: Published in the Micro-Expression Grand Challenge 2018, a workshop of the 13th IEEE Face & Gesture 2018.
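
    A hedged sketch of the CNN-then-LSTM pipeline the abstract describes: each frame is encoded into a feature vector, the sequence of vectors is passed through an LSTM, and the final hidden state is classified. The tiny backbone, feature width and class count here are stand-in assumptions; ELRCN itself builds on a deeper pretrained CNN.

```python
import torch
import torch.nn as nn

class CNNLSTMSketch(nn.Module):
    def __init__(self, feat_dim=256, hidden=128, n_classes=5):
        super().__init__()
        # Tiny stand-in CNN encoder (hypothetical, for illustration only).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, clip):                  # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.encoder(clip.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)             # per-frame hidden states
        return self.head(out[:, -1])          # classify from the last step

logits = CNNLSTMSketch()(torch.randn(2, 8, 3, 64, 64))  # -> (2, 5)
```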

    Shallow Triple Stream Three-dimensional CNN (STSTNet) for Micro-expression Recognition

    In recent years, the state of the art in facial micro-expression recognition has been significantly advanced by deep neural networks. The robustness of deep learning has yielded promising performance beyond that of traditional handcrafted approaches. Most works in the literature emphasize increasing the depth of networks and employing highly complex objective functions to learn more features. In this paper, we design a Shallow Triple Stream Three-dimensional CNN (STSTNet) that is computationally light whilst capable of extracting discriminative high-level features and details of micro-expressions. The network learns from three optical flow features (i.e., optical strain, horizontal and vertical optical flow fields) computed from the onset and apex frames of each video. Our experimental results demonstrate the effectiveness of the proposed STSTNet, which obtained an unweighted average recall of 0.7605 and an unweighted F1-score of 0.7353 on the composite database consisting of 442 samples from the SMIC, CASME II and SAMM databases. Comment: 5 pages, 1 figure; accepted and published in IEEE FG 2019.
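
    A minimal sketch of a shallow triple-stream design under stated assumptions: three small convolutional streams read the same 3-channel optical-flow input (strain, horizontal, vertical fields), and their pooled features are concatenated for classification. The filter counts and 28x28 input size are illustrative, not a claim about the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TripleStreamSketch(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        def stream(out_ch):
            # One shallow stream: a single conv layer plus pooling.
            return nn.Sequential(
                nn.Conv2d(3, out_ch, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d(3), nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            )
        self.streams = nn.ModuleList([stream(c) for c in (3, 5, 8)])
        self.head = nn.Linear((3 + 5 + 8) * 4 * 4, n_classes)

    def forward(self, flow):                  # flow: (B, 3, H, W)
        merged = torch.cat([s(flow) for s in self.streams], dim=1)
        return self.head(merged)

logits = TripleStreamSketch()(torch.randn(2, 3, 28, 28))  # -> (2, 3)
```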

    Where Is the Emotion? Dissecting A Multi-Gap Network for Image Emotion Classification

    Image emotion recognition has become an increasingly popular research domain in the areas of image processing and affective computing. Despite fast-improving classification performance on this task, the understanding and interpretability of that performance are still lacking, as there are limited studies on which parts of an image invoke a particular emotion. In this work, we propose a Multi-GAP deep neural network for image emotion classification, which is extensible to accommodate multiple streams of information. We also incorporate feature dependency into our network blocks by adding a bidirectional GRU network to learn transitional features. We report extensive results on variants of the proposed network and provide valuable perspectives on the class-activated regions via Grad-CAM, and on the contribution of network depth via a truncation strategy.
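
    An illustrative sketch of the multi-GAP idea as the abstract describes it: global-average-pooled features taken at several depths of a CNN are treated as a short sequence and fed through a bidirectional GRU to capture transitional features. The toy backbone and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class MultiGAPSketch(nn.Module):
    def __init__(self, n_classes=8, dim=64):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Conv2d(cin, cout, 3, stride=2, padding=1), nn.ReLU())
            for cin, cout in [(3, 32), (32, 64), (64, 128)]
        ])
        # Project each block's GAP vector to a common width for the GRU.
        self.proj = nn.ModuleList([nn.Linear(c, dim) for c in (32, 64, 128)])
        self.gru = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * dim, n_classes)

    def forward(self, x):                     # x: (B, 3, H, W)
        gaps = []
        for blk, proj in zip(self.blocks, self.proj):
            x = blk(x)
            gaps.append(proj(x.mean(dim=(2, 3))))  # GAP at this depth
        seq = torch.stack(gaps, dim=1)             # (B, depths, dim)
        out, _ = self.gru(seq)                     # transitional features
        return self.head(out[:, -1])

logits = MultiGAPSketch()(torch.randn(2, 3, 64, 64))  # -> (2, 8)
```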

    LiteEmo: Lightweight Deep Neural Networks for Image Emotion Recognition

    Psychology studies have shown that an image can invoke various emotions, depending on its visual features as well as its semantic content. The ability to identify image emotion can be very useful for many applications, including image retrieval and aesthetics prediction. Notably, most existing deep learning-based emotion recognition models do not capitalize on additional semantic or contextual information and are computationally expensive. To overcome these limitations, we propose a lightweight multi-stream deep network that concatenates several MobileNet networks for image emotion analysis. The streams of the multi-stream network represent the core emotion recognition, object recognition and image category recognition models, respectively. Experimental results demonstrate the effectiveness of the additional contextual information in producing performance comparable to state-of-the-art emotion models, but with fewer parameters, thus improving practicality.
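
    A sketch of the multi-stream idea, assuming three MobileNetV2 backbones from torchvision as stand-ins for the emotion, object and category streams, with globally pooled features concatenated before a small classifier. This is not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class MultiStreamMobileNetSketch(nn.Module):
    def __init__(self, n_classes=8, n_streams=3):
        super().__init__()
        # .features is the 1280-channel convolutional trunk of MobileNetV2.
        self.streams = nn.ModuleList(
            [mobilenet_v2(weights=None).features for _ in range(n_streams)]
        )
        self.head = nn.Linear(1280 * n_streams, n_classes)

    def forward(self, x):                     # the same image feeds each stream
        feats = [s(x).mean(dim=(2, 3)) for s in self.streams]  # GAP per stream
        return self.head(torch.cat(feats, dim=1))

logits = MultiStreamMobileNetSketch()(torch.randn(1, 3, 224, 224))  # -> (1, 8)
```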

    Dual-stream Shallow Networks for Facial Micro-expression Recognition

    Micro-expressions are spontaneous, brief and subtle facial muscle movements that expose underlying emotions. Motivated by recent forays into deep learning for micro-expression analysis, we propose a lightweight dual-stream shallow network in the form of a pair of truncated CNNs with heterogeneous input features. Merging the convolutional features allows for discriminative learning of micro-expression classes from both streams. Using activation heatmaps, we further demonstrate that salient facial areas are well emphasized and correspond closely to the relevant action units for each emotion class. We empirically validate the proposed network on three benchmark databases, obtaining state-of-the-art performance on CASME II and SAMM while remaining competitive on SMIC. Further observations point towards the sufficiency of shallower deep networks for micro-expression recognition.
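
    A minimal sketch of a dual-stream shallow network: two truncated CNNs over heterogeneous inputs, assumed here to be a two-channel optical-flow map and a grayscale apex frame, merged by concatenating their convolutional feature maps before a shared classifier head. Channel counts are illustrative.

```python
import torch
import torch.nn as nn

class DualStreamSketch(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        def shallow(in_ch):
            # A deliberately truncated (two-layer) CNN stream.
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            )
        self.flow_stream = shallow(2)         # horizontal + vertical flow
        self.gray_stream = shallow(1)         # grayscale apex frame
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, n_classes)
        )

    def forward(self, flow, gray):
        merged = torch.cat([self.flow_stream(flow), self.gray_stream(gray)], 1)
        return self.head(merged)

logits = DualStreamSketch()(torch.randn(2, 2, 64, 64), torch.randn(2, 1, 64, 64))
```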

    Needle in a Haystack: Spotting and recognising micro-expressions “in the wild”

    Computational research on facial micro-expressions has long focused on videos captured under constrained laboratory conditions, owing to the challenging elicitation process and the limited samples that are publicly available. Moreover, processing micro-expressions is extremely challenging under unconstrained scenarios. This paper introduces, for the first time, a completely automatic micro-expression “spot-and-recognize” framework that operates on in-the-wild videos, such as poker games and political interviews. The proposed method first spots the apex frame of a video while handling head movements and unconscious actions, which are typically larger in motion intensity, with alignment employed to enforce a canonical face pose. Optical flow guided features play a central role in our method: they robustly identify the location of the apex frame, and are used to learn a shallow neural network model for emotion classification. Experimental results demonstrate the feasibility of the proposed methodology, establishing good baselines for both spotting and recognition tasks: an ASR of 0.33 and an F1-score of 0.6758, respectively, on the MEVIEW micro-expression database. In addition, we present comprehensive qualitative and quantitative analyses to further show the effectiveness of the proposed framework, with a new suggestion for an appropriate evaluation protocol. In a nutshell, this paper provides a new benchmark for apex spotting and emotion recognition in an in-the-wild setting.
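
    A hedged sketch of the spotting step only, with OpenCV's Farneback dense optical flow as a stand-in for the paper's optical-flow-guided features: compute flow from the first frame to every later frame and take the frame with the peak mean flow magnitude as the apex candidate. Function name and inputs are hypothetical.

```python
import cv2
import numpy as np

def spot_apex(gray_frames):
    """gray_frames: list of HxW uint8 grayscale face crops; returns apex index."""
    onset = gray_frames[0]
    magnitudes = [0.0]
    for frame in gray_frames[1:]:
        # Dense flow from the onset frame to this frame (Farneback stand-in).
        flow = cv2.calcOpticalFlowFarneback(
            onset, frame, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2)    # per-pixel flow magnitude
        magnitudes.append(float(mag.mean()))
    return int(np.argmax(magnitudes))         # frame with strongest motion
```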

    Revealing the invisible with model and data shrinking for composite-database micro-expression recognition

    Composite-database micro-expression recognition is attracting increasing attention as it is more practical for real-world applications. Though a composite database provides more sample diversity for learning good representation models, the important subtle dynamics are prone to disappearing under domain shift, causing models, especially deep ones, to degrade greatly in performance. In this article, we analyze the influence of learning complexity, including input complexity and model complexity, and find that lower-resolution input data and a shallower-architecture model help ease the degradation of deep models on the composite-database task. Based on this, we propose a recurrent convolutional network (RCN) that explores a shallower architecture and lower-resolution input data, shrinking model and input complexities simultaneously. Furthermore, we develop three parameter-free modules (i.e., wide expansion, shortcut connection and attention unit) to integrate with the RCN without adding any learnable parameters. These three modules enhance representational ability from various perspectives while preserving a not-very-deep architecture for lower-resolution data. The three modules can further be combined by an automatic strategy (a neural architecture search strategy), and the searched architecture is more robust. Extensive experiments on the MEGC2019 dataset (a composite of the existing SMIC, CASME II and SAMM datasets) verify the influence of learning complexity and show that RCNs with the three modules and the searched combination outperform state-of-the-art approaches.
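
    A rough sketch of one recurrent convolutional block under stated assumptions: a single weight-shared convolution unrolled for a few steps over low-resolution features, with a parameter-free shortcut connection back to the block input. The actual RCN and its other modules (wide expansion, attention unit) differ in detail.

```python
import torch
import torch.nn as nn

class RCNBlockSketch(nn.Module):
    def __init__(self, channels=32, steps=3):
        super().__init__()
        self.steps = steps
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)  # shared weights
        self.act = nn.ReLU()

    def forward(self, x):
        h = x
        for _ in range(self.steps):           # recurrent unrolling, no new params
            h = self.act(self.conv(h) + x)    # parameter-free shortcut to input
        return h

feat = RCNBlockSketch()(torch.randn(1, 32, 28, 28))  # -> (1, 32, 28, 28)
```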

    On Vehicle State Tracking for Long-Term Carpark Video Surveillance
